Smart Paper Notes

Transformer in Transformer as Backbone for Deep Reinforcement Learning

Hangyu Mao , Rui Zhao , Hao Chen , Jianye Hao , Yiqun Chen , Dong Li , Junge Zhang , Zhen Xiao

Categories: Machine Learning | Artificial Intelligence | Robotics

2022-12-30

Designing better deep networks and designing better reinforcement learning (RL) algorithms are both important for deep RL. This work focuses on the former. Previous methods build the network from several modules such as CNN, LSTM and Attention. Recent methods combine the Transformer with these modules for better performance. However, training a network composed of mixed modules requires tedious optimization skills, which makes these methods inconvenient to use in practice. In this paper, we propose to design pure Transformer-based networks for deep RL, aiming to provide off-the-shelf backbones for both the online and offline settings. Specifically, we propose the Transformer in Transformer (TIT) backbone, which cascades two Transformers in a natural way: the inner one processes a single observation, while the outer one processes the observation history; combining both is expected to extract spatial-temporal representations for good decision-making. Experiments show that TIT consistently achieves satisfactory performance across different settings.
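
The inner/outer cascade described in this abstract can be illustrated with a shape-level sketch. This is not the authors' implementation: the identity query/key/value projections, the mean pooling, and the names `inner_encoder` and `outer_encoder` are simplifying assumptions chosen so the example runs with the standard library only.

```python
import math

def self_attention(tokens):
    """Single-head self-attention with identity Q/K/V projections (assumption: no learned weights)."""
    d = len(tokens[0])
    outputs = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]  # numerically stable softmax
        z = sum(exps)
        weights = [e / z for e in exps]
        outputs.append([sum(w * v[i] for w, v in zip(weights, tokens)) for i in range(d)])
    return outputs

def mean_pool(tokens):
    """Average a list of vectors into one vector (assumption: stands in for a learned summary token)."""
    n = len(tokens)
    return [sum(t[i] for t in tokens) / n for i in range(len(tokens[0]))]

def inner_encoder(observation_patches):
    """Inner stage: summarize the patches of ONE observation into a single vector."""
    return mean_pool(self_attention(observation_patches))

def outer_encoder(history):
    """Outer stage: attend over the per-observation vectors across the history."""
    obs_vectors = [inner_encoder(obs) for obs in history]
    return mean_pool(self_attention(obs_vectors))

# Hypothetical input: 3 time steps, each observation split into 2 patches of dimension 4.
history = [
    [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]],
    [[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]],
    [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]],
]
state = outer_encoder(history)
print(len(state), state)
```

The resulting `state` vector is what a policy or value head would consume; a practical backbone would add learned projections, positional encodings, and feed-forward layers at both levels.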

Translated by Google Translate

VSVC: Backdoor attack against Keyword Spotting based on Voiceprint Selection and Voice Conversion

Hanbo Cai , Pengcheng Zhang , Hai Dong , Yan Xiao , Shunhui Ji

Categories: Artificial Intelligence | Machine Learning

2022-12-20

Keyword spotting (KWS) based on deep neural networks (DNNs) has achieved massive success in voice control scenarios. However, training such DNN-based KWS systems often requires significant data and hardware resources, so manufacturers often entrust this process to a third-party platform. This makes the training process uncontrollable, since attackers can implant backdoors in the model by manipulating third-party training data. An effective backdoor attack can force the model to make specified judgments under certain conditions, i.e., triggers. In this paper, we design a backdoor attack scheme based on Voiceprint Selection and Voice Conversion, abbreviated as VSVC. Experimental results demonstrate that VSVC achieves an average attack success rate close to 97% across four victim models while poisoning less than 1% of the training data.


DeepJoin: Joinable Table Discovery with Pre-trained Language Models

Yuyang Dong , Chuan Xiao , Takuma Nozawa , Masafumi Enomoto , Masafumi Oyamada

Categories: Artificial Intelligence | Machine Learning

2022-12-15

Because of its usefulness in data enrichment for data analysis tasks, joinable table discovery has become an important operation in data lake management. Existing approaches target equi-joins, the most common way of combining tables to create a unified view, or semantic joins, which tolerate misspellings and different formats to deliver more join results. They are either exact solutions whose running time is linear in the sizes of the query column and the target table repository, or approximate solutions lacking precision. In this paper, we propose Deepjoin, a deep learning model for accurate and efficient joinable table discovery. Our solution is an embedding-based retrieval method that employs a pre-trained language model (PLM) and is designed as one framework serving both equi- and semantic joins. We propose a set of contextualization options to transform column contents into a text sequence. The PLM reads the sequence and is fine-tuned to embed columns into vectors such that columns are expected to be joinable if they are close to each other in the vector space. Since the output of the PLM is fixed in length, the subsequent search procedure becomes independent of the column size. With a state-of-the-art approximate nearest neighbor search algorithm, the search time is logarithmic in the repository size. To train the model, we devise techniques for preparing training data as well as for data augmentation. Experiments on real datasets demonstrate that, by training on a small subset of a corpus, Deepjoin generalizes to large datasets and its precision consistently outperforms that of other approximate solutions. Deepjoin is even more accurate than an exact solution to semantic joins when evaluated with labels from experts. Moreover, when equipped with a GPU, Deepjoin is up to two orders of magnitude faster than existing solutions.
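
The retrieval pipeline in this abstract (contextualize a column as text, embed it to a fixed-length vector, then search by vector similarity) can be sketched as follows. Everything concrete here is an assumption for illustration: `toy_embed` is a hashed character-trigram vector standing in for the fine-tuned PLM, `contextualize` is one hypothetical contextualization option, the repository is made up, and the brute-force `search` replaces the approximate nearest neighbor index that gives the logarithmic search time mentioned above.

```python
import math
import zlib

def contextualize(column_name, values):
    """Turn a column into one text sequence (hypothetical contextualization option)."""
    return f"{column_name}: " + ", ".join(str(v) for v in values)

def toy_embed(text, dim=1024):
    """Stand-in for the PLM: hashed bag of character trigrams, L2-normalized."""
    vec = [0.0] * dim
    text = text.lower()
    for i in range(len(text) - 2):
        gram = text[i:i + 3]
        vec[zlib.crc32(gram.encode("utf-8")) % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def cosine(u, v):
    """Cosine similarity of two already-normalized vectors."""
    return sum(a * b for a, b in zip(u, v))

def search(query_vec, index, k=1):
    """Brute-force top-k by cosine similarity (a real system would use an ANN index)."""
    ranked = sorted(index.items(), key=lambda kv: cosine(query_vec, kv[1]), reverse=True)
    return [name for name, _ in ranked[:k]]

# Hypothetical repository of columns: name -> (column header, cell values).
repository = {
    "customers.city": ("city", ["New York", "Los Angeles", "Chicago", "Boston"]),
    "orders.amount": ("amount", [12.5, 99.0, 3.75, 48.2]),
    "staff.email": ("email", ["ann@example.com", "bob@example.com"]),
}
index = {name: toy_embed(contextualize(col, vals)) for name, (col, vals) in repository.items()}

# Query column with different casing and partially different values.
query = toy_embed(contextualize("city", ["new york", "chicago", "boston", "seattle"]))
top = search(query, index, k=1)
print(top)
```

Because every column maps to a vector of the same length, the cost of comparing two columns does not depend on how many cells they contain, which is the property the abstract relies on.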


LEAD: Liberal Feature-based Distillation for Dense Retrieval

Hao Sun , Xiao Liu , Yeyun Gong , Anlei Dong , Jian Jiao , Jingwen Lu , Yan Zhang , Daxin Jiang , Linjun Yang , Rangan Majumder

Categories: Natural Language Processing

2022-12-10

Knowledge distillation is often used to transfer knowledge from a strong teacher model to a relatively weak student model. Traditional knowledge distillation methods include response-based methods and feature-based methods. Response-based methods are the most widely used but suffer from a lower ceiling on model performance, while feature-based methods impose constraints on vocabularies and tokenizers. In this paper, we propose liberal feature-based distillation (LEAD), a tokenizer-free method. LEAD aligns the distributions of the teacher model and the student model; it is effective, extendable, and portable, and has no requirements on vocabularies, tokenizers, or model architecture. Extensive experiments show the effectiveness of LEAD on several widely used benchmarks, including MS MARCO Passage, TREC Passage 19, TREC Passage 20, MS MARCO Document, TREC Document 19 and TREC Document 20.
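
For readers unfamiliar with the response-based family mentioned above, a minimal sketch of its usual loss follows: the student is trained to match the teacher's softened score distribution over candidate passages. This illustrates the traditional baseline, not LEAD itself; the scores, the temperature value, and the function names are assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, computed in a numerically stable way."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    z = sum(exps)
    return [e / z for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same candidates."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def response_distillation_loss(teacher_scores, student_scores, temperature=2.0):
    """KL divergence between softened teacher and student distributions."""
    p = softmax(teacher_scores, temperature)
    q = softmax(student_scores, temperature)
    return kl_divergence(p, q)

# Hypothetical relevance scores of one query against three candidate passages.
teacher = [4.0, 1.0, 0.5]
student_close = [3.5, 1.2, 0.4]   # ranks passages like the teacher
student_far = [0.5, 1.0, 4.0]     # ranks passages in the opposite order

loss_close = response_distillation_loss(teacher, student_close)
loss_far = response_distillation_loss(teacher, student_far)
print(loss_close, loss_far)
```

The loss only sees output scores, so it needs no shared vocabulary or tokenizer, but it also conveys nothing about the teacher's intermediate representations.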


CSQ: Growing Mixed-Precision Quantization Scheme with Bi-level Continuous Sparsification

Lirui Xiao , Huanrui Yang , Zhen Dong , Kurt Keutzer , Li Du , Shanghang Zhang

Categories: Computer Vision

2022-12-06

Mixed-precision quantization has been widely applied to deep neural networks (DNNs) because it leads to significantly better efficiency-accuracy tradeoffs than uniform quantization. However, determining the exact precision of each layer remains challenging. Previous attempts at bit-level regularization and pruning-based dynamic precision adjustment during training suffer from noisy gradients and unstable convergence. In this work, we propose Continuous Sparsification Quantization (CSQ), a bit-level training method that searches for mixed-precision quantization schemes with improved stability. CSQ stabilizes the bit-level mixed-precision training process with a bi-level gradual continuous sparsification on both the bit values of the quantized weights and the bit selection that determines the quantization precision of each layer. The continuous sparsification scheme enables fully differentiable training without gradient approximation while achieving an exact quantized model in the end. A budget-aware regularization of total model size enables the dynamic growth and pruning of each layer's precision towards a mixed-precision quantization scheme of the desired size. Extensive experiments show that CSQ achieves a better efficiency-accuracy tradeoff than previous methods on multiple models and datasets.
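
The idea of continuous sparsification can be sketched with a smooth gate that sharpens into a hard 0/1 decision as a temperature parameter grows. The sketch below applies such gates to the bit planes of a single weight. It is a generic illustration under assumptions (sigmoid gates, the `beta` schedule, and the bit scores are all hypothetical), not the exact parameterization used by CSQ.

```python
import math

def soft_gate(score, beta):
    """Smooth relaxation of a 0/1 step function; approaches a hard gate as beta grows."""
    return 1.0 / (1.0 + math.exp(-beta * score))

def hard_gate(score):
    """Exact binary gate that the soft gate converges to."""
    return 1.0 if score > 0 else 0.0

def quantized_weight(bit_scores, beta, scale=1.0):
    """Weight expressed as a gated sum of bit planes: scale * sum_i gate_i * 2^i."""
    return scale * sum(soft_gate(s, beta) * (2 ** i) for i, s in enumerate(bit_scores))

# Hypothetical learnable scores for four bits, least significant bit first.
bit_scores = [0.8, -0.5, 1.2, -2.0]

# Raising beta during training moves the differentiable value towards an exact integer.
for beta in (1.0, 10.0, 100.0):
    print(beta, quantized_weight(bit_scores, beta))

exact = sum(hard_gate(s) * (2 ** i) for i, s in enumerate(bit_scores))
print(exact)
```

At small `beta` the gates are differentiable everywhere, so ordinary gradients flow; at large `beta` the value coincides with the hard-gated one, which is why no gradient approximation is needed to end with an exact quantized model.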


Syntax-Guided Domain Adaptation for Aspect-based Sentiment Analysis

Anguo Dong , Cuiyun Gao , Yan Jia , Qing Liao , Xuan Wang , Lei Wang , Jing Xiao

Categories: Artificial Intelligence

2022-11-10

Aspect-based sentiment analysis (ABSA) aims to extract opinionated aspect terms from review texts and determine their sentiment polarities, and it is widely studied in both academia and industry. Because ABSA is a fine-grained classification task, its annotation cost is extremely high. Domain adaptation is a popular way to alleviate the data deficiency issue in new domains by transferring common knowledge across domains. Most cross-domain ABSA studies are based on structure correspondence learning (SCL) and use pivot features to construct auxiliary tasks that narrow the gap between domains. However, their pivot-based auxiliary tasks can only transfer knowledge of aspect terms but not sentiment, which limits the performance of existing models. In this work, we propose a novel Syntax-guided Domain Adaptation Model, named SDAM, for more effective cross-domain ABSA. SDAM exploits syntactic structure similarities to build pseudo training instances, in which aspect terms of the target domain are explicitly related to sentiment polarities. In addition, we propose a syntax-based BERT masked language model to further capture domain-invariant features. Finally, to alleviate the sentiment inconsistency issue in multi-gram aspect terms, we introduce a span-based joint aspect term and sentiment analysis module into cross-domain End2End ABSA. Experiments on five benchmark datasets show that our model consistently outperforms the state-of-the-art baselines in terms of the Micro-F1 metric on the cross-domain End2End ABSA task.


Point Normal Orientation and Surface Reconstruction by Incorporating Isovalue Constraints to Poisson Equation

Dong Xiao , Zuoqiang Shi , Siyu Li , Bailin Deng , Bin Wang

Categories: Computer Vision

2022-09-30

Oriented normals are common prerequisites for many geometric algorithms based on point clouds, such as Poisson surface reconstruction. However, it is not trivial to obtain a consistent orientation. In this work, we bridge orientation and reconstruction in implicit space and propose a novel approach to orient point cloud normals by incorporating isovalue constraints into the Poisson equation. Our key observation is that when a point cloud with consistently oriented normals is used as the input for implicit surface reconstruction, the indicator function values of the sample points should be close to the isovalue of the surface. Based on this observation and the Poisson equation, we propose an optimization formulation that combines isovalue constraints with local consistency requirements for normals. We optimize normals and implicit functions simultaneously and solve for a globally consistent orientation. Thanks to the sparsity of the linear system, our method runs on an average laptop in reasonable computational time. Experiments show that our method achieves high performance on non-uniform and noisy data and handles varying sampling densities, artifacts, multiple connected components, and nested surfaces.


PROD: Progressive Distillation for Dense Retrieval

Zhenghao Lin , Yeyun Gong , Xiao Liu , Hang Zhang , Chen Lin , Anlei Dong , Jian Jiao , Jingwen Lu , Daxin Jiang , Rangan Majumder

Categories: Natural Language Processing

2022-09-27

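Knowledge distillation is an effective way to transfer knowledge from a strong teacher to an efficient student model. Ideally, we expect that the better the teacher, the better the student. However, this expectation does not always hold. Often, a better teacher model yields a worse student after distillation because of the non-negligible gap between teacher and student. To bridge this gap, we propose PROD, a progressive distillation method for dense retrieval. PROD consists of teacher progressive distillation and data progressive distillation, which together gradually improve the student. We conduct extensive experiments on five widely used benchmarks (MS MARCO Passage, TREC Passage 19, TREC Document 19, MS MARCO Document, and Natural Questions), where PROD achieves state-of-the-art results among distillation methods for dense retrieval. The code and models will be released.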


Hierarchical Interdisciplinary Topic Detection Model for Research Proposal Classification

Meng Xiao , Ziyue Qiao , Yanjie Fu , Hao Dong , Yi Du , Pengyang Wang , Hui Xiong , Yuanchun Zhou

Categories: Natural Language Processing | Machine Learning

2022-09-16

The peer merit review of research proposals has been the major mechanism for deciding grant awards. However, research proposals have become increasingly interdisciplinary. It has been a longstanding challenge to assign interdisciplinary proposals to appropriate reviewers so that proposals are fairly evaluated. One of the critical steps in reviewer assignment is to generate accurate interdisciplinary topic labels for proposal-reviewer matching. Existing systems mainly collect topic labels manually generated by principal investigators. However, such human-reported labels can be inaccurate and incomplete, and producing them is labor-intensive and time-consuming. What role can AI play in developing a fair and precise proposal reviewer assignment system? In this study, we collaborate with the National Science Foundation of China to address the task of automated interdisciplinary topic path detection. For this purpose, we develop a deep Hierarchical Interdisciplinary Research Proposal Classification Network (HIRPCN). Specifically, we first propose a hierarchical transformer to extract the textual semantic information of proposals. We then design an interdisciplinary graph and leverage GNNs to learn representations of each discipline in order to extract interdisciplinary knowledge. After extracting the semantic and interdisciplinary knowledge, we design a level-wise prediction component to fuse the two types of knowledge representations and detect interdisciplinary topic paths for each proposal. We conduct extensive experiments and expert evaluations on three real-world datasets to demonstrate the effectiveness of our proposed model.


The Outcome of the 2022 Landslide4Sense Competition: Advanced Landslide Detection from Multi-Source Satellite Imagery

Omid Ghorbanzadeh , Yonghao Xu , Hengwei Zhao , Junjue Wang , Yanfei Zhong , Dong Zhao , Qi Zang , Shuang Wang , Fahong Zhang , Yilei Shi

Categories: Computer Vision

2022-09-06

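The scientific outcomes of the 2022 Landslide4Sense (L4S) competition, organized by the Institute of Advanced Research in Artificial Intelligence (IARAI), are presented here. The objective of the competition is to automatically detect landslides from large-scale, multi-source satellite imagery collected globally. The 2022 L4S aims to foster interdisciplinary research on recent developments in deep learning (DL) models for semantic segmentation using satellite imagery. In the past few years, DL-based models have achieved performance that meets expectations in image interpretation, thanks to the development of convolutional neural networks (CNNs). The main objective of this article is to present the details and the best-performing algorithms featured in this competition. The winning solutions build on state-of-the-art models such as Swin Transformer, SegFormer, and U-Net. Advanced machine learning techniques and strategies such as hard example mining, self-training, and mixup data augmentation are also considered. In addition, we describe the L4S benchmark dataset to facilitate further comparisons and report the results of the accuracy assessment online. The data are accessible on the Future Development Leaderboard for future evaluation at https://www.iarai.ac.ac.at/landslide4sense/challenge/, and researchers are invited to submit more prediction results, evaluate the accuracy of their methods, compare them with those of other users, and, ideally, improve on the landslide detection results reported in this article.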
